(54) Method and device for the processing of images based on morphable models



(57) A method of processing an image of a three-dimensional object, comprising the steps of providing a morphable object model being derived from a plurality of 3D images, matching the morphable object model to at least one 2D object image, and providing the matched morphable object model as a 3D representation of the object. A method of generating a morphable object model comprises the steps of generating a 3D database comprising a plurality of 3D images of prototype objects, subjecting the data of the 3D database to a data processing providing correspondences between the prototype objects and at least one reference object, and providing the morphable object model as a set of objects comprising linear combinations of the shapes and textures of the prototype objects.
[...] manipulation, in particular the manipulation of human faces. Modeling human faces has challenged researchers in computer graphics since its beginning. Since the pioneering work of Parke [23, 24; see [...] references], various techniques have been reported for modeling the geometry of faces [...] and for animating them [26, 13, 17, 29, 20, 35, 27]. A detailed overview can be found in the book of Parke and Waters [...].
[0003] The techniques developed for the animation of faces can be roughly separated in those that rely on physical modeling of facial muscles [35, ...], and in those applying previously captured facial expressions to a face [...]. These performance based animation techniques compute the correspondence between the different facial expressions of a

n P :mb°er b o y , ISS^K^^^BT * ^ "* 
genera changes in the appearance o, an individual can be de^rfcaSeMhorasm'^S^^nTeS^d.^ 

taton of t^Sjn 8 <=°"v,nc,ng mapping of motion data from .he reference to a new model, or the J^. 

» o, eTes and moZ " ma ' Chin9 * 6ChniqUeS ^ * ^ «* P ™™« P** such 

Kirt~ SeC ° nd ,VPe °' Pr0b ' 6m m ,aCS m0deMn9 is the Ration °< natural faces from non faces For this human 

fo^l, .hjVboveto h r^ e P ,'° Vide ™ Pr ° Ved ima9S prOCeSSin 9 me,hods and *■»*"• being capable 

eSe manneT " * Pr °° eSS ima9SS °' ,hree - dimens '°" al ° b '<*.s * a more flexible' and 

[...] This object is solved by a method and a system comprising the features of claim 1, 7, 12, 13 and 14, respectively. [...] implementations of the invention are defined in the dependent claims.
According to the invention, a parametric face modeling technique is presented that assists in both above problems. First, arbitrary human faces can be created, simultaneously controlling the likelihood of the generated faces. Second, the system is able to compute correspondence between new faces. Exploiting the statistics of a large data set of 3D face scans (geometric and textural data, Cyberware™), a morphable face model has been built which allows [...] morphing function that is based on the linear combination of a large number of 3D face scans. Computing the average face and the main modes of variation in the data set, a probability distribution is imposed on the morphing function to avoid unlikely faces. Also, parametric descriptions of face attributes such as gender [...]

!~: c h^,e^°o? L W e Sace 3 '~ ^ ^ **" by — * "» — °« 

5 z^^tes^^""" ,ac r and i,s ~— by - 
is captured in our model function is sufficient to make reasonable estimates of the full 3D shape and texture of a face even when only a single picture is available. When applying the method to several images of a person, the reconstructions reach almost the quality of laser scans.

[0010] The key part of the invention is a generalized model of human faces. Similar to the approach of DeCarlo et al. [9], the range of allowable faces is restricted according to constraints derived from prototypical human faces. However, instead of using a limited set of measurements and proportions between a set of facial landmarks, the densely sampled geometry of the exemplar faces obtained by laser scanning (Cyberware™) is used directly. The dense modeling of facial geometry (several thousand vertices per face) leads directly to a triangulation of the surface. Consequently, there is no need for variational surface interpolation techniques [9, 21, 30]. The inventors also added a model of texture variations between faces. The morphable 3D face model is a logical extension of the interpolation technique between face geometries, as introduced by Parke [24]. By computing correspondence between individual 3D face data automatically, the invention makes it possible to increase the number of vertices used in the face representation from a few hundred to tens of thousands.

[0011] Moreover, a higher number of faces can be used, and thus it is possible to interpolate between hundreds of 'basis' faces rather than just a few. The goal of such an extended morphable face model is to represent any face as a linear combination of a limited basis set of face prototypes. Representing the face of an arbitrary person as a linear combination (morph) of "prototype" faces was first formulated for image compression in telecommunications [7]. Image-based linear 2D face models that exploit large data sets of prototype faces were developed for face recognition and image coding [3, 16, 34].

[0012] Different approaches have been taken to automate the matching step necessary for building up morphable models. One class of techniques is based on optical flow algorithms [4, 3] and another on an active model matching strategy [11, 15]. Combinations of both techniques have been applied to the problem of image matching [33]. According to the invention, an extension of this approach to the problem of matching 3D faces has been obtained.
[0013] The correspondence problem between different three-dimensional face data has been addressed previously by Lee et al. [18]. Their shape-matching algorithm differs significantly from the invention in several respects. First, the correspondence is computed in high resolution, considering shape and texture data simultaneously. Second, instead of using a physical tissue model to constrain the range of allowed mesh deformations, the statistics of example faces are used to keep deformations plausible. Third, the system of the invention does not rely on routines that are specifically designed to detect the features exclusively found in human faces, e.g., eyes, nose.

[0014] The general matching strategy of the invention can be used not only to adapt the morphable model to a 3D face scan, but also to 2D images of faces. Unlike a previous approach [32], the morphable 3D face model is now directly matched to images, avoiding the detour of generating intermediate 2D morphable image models. As an advantageous consequence, head orientation, illumination conditions and other parameters can be free variables subject to optimization. It is sufficient to use rough estimates of their values as a starting point of the automated matching procedure.

[0015] Most techniques for 'face cloning', the reconstruction of a 3D face model from one or more images, still rely on manual assistance for matching a deformable 3D face model to the images [24, 1, 28]. The approach of Pighin et al. [26] demonstrates the high realism that can be achieved for the synthesis of faces and facial expressions from photographs where several images of a face are matched to a single 3D face model. The automated matching procedure of the invention could be used to replace the manual initialization step, where several corresponding features have to be labeled in the presented images.

[0016] One particular advantage of the invention is that it works directly on faces without manual markers. In the automated approach the number of markers is extended to its limit. It matches the full number of vertices available in the face model to images. The resulting dense correspondence fields can even capture changes in wrinkles and map these from one face to another.

[0017] The invention teaches a new technique for modeling textured 3D faces. 3D faces can either be generated automatically from one or more photographs, or modeled directly through an intuitive user interface. Users are assisted in two key problems of computer aided face modeling. First, new face images or new 3D face models can be registered automatically by computing dense one-to-one correspondence to an internal face model. Second, the approach regulates the naturalness of modeled faces, avoiding faces with an "unlikely" appearance.

[0018] Applications of the invention are in particular in the field of facial modeling, registration, photogrammetry, 
morphing, facial animation, and computer vision. 

[0019] Further advantages and details of the invention are described with reference to the attached drawings, which show:

Figure 1: a schematic representation of basic principles of the invention,

Figure 2: an illustration of the face synthesis on the basis of the morphable model,

Figure 3: an illustration of the variation of facial attributes of a single face,

Figure 4: a flow chart illustrating the processing steps for reconstructing 3D shape and texture of a new face from 
a single image, 



Figure 5: a flow chart of the simultaneous reconstruction of a 3D shape and texture of a new face from two images, 

Figure 6: an illustration of the generation of new images with modified rendering parameters, 

Figure 7: an illustration of the reconstruction of a 3D face of Mona Lisa on the basis of the invention, and

Figure 8: a schematic illustration of an image processing system according to the invention. 

[0020] As illustrated in Figure 1, starting from an example set of 3D face models, a morphable face model is derived by transforming the shape and texture of the examples into a vector space representation. The morphable face model contributes to two main steps in face manipulation: (1) deriving a 3D face model from a novel image, and (2) modifying shape and texture in a natural way. New faces and expressions can be modeled by forming linear combinations of the prototypes. Shape and texture constraints derived from the statistics of our example faces are used to guide manual modeling or automated matching algorithms. 3D face reconstructions from single images and their applications to photo-realistic image manipulations can be obtained. Furthermore, face manipulations according to complex parameters such as gender, fullness of a face or its distinctiveness are demonstrated.
[0021] The further description is structured as follows. It starts with a description (I) of the database of 3D face scans from which our morphable model is built. In the following section (II), the concept of the morphable face model is introduced, assuming a set of 3D face scans that are in full correspondence. Exploiting the statistics of a data set, a [...] parameter space of the model are mapped. In section III, a method for matching the flexible model to novel images or 3D scans of faces is described. Along with a 3D reconstruction, the method can compute correspondence, based on the morphable model. Section IV describes an iterative method for building a morphable model automatically from a raw data set of 3D face scans when no correspondences between the exemplar faces are available. Finally, applications of the technique to novel images will be shown.

[0022] The description of the method according to the invention refers to the attached figures. It is emphasized that

^«;rr sons (in a pa,ent appiica,ion) are n °» capabie ,o — •» ^ « °< - j£ 



I Database



[0023] Laser scans (Cyberware™) of 200 heads of young adults (100 male and 100 female) were used for obtaining [...]. The laser scans provide head structure data in a cylindrical representation, with radii $r(h, \phi)$ of surface points sampled at 512 equally spaced angles $\phi$ and at 512 equally spaced vertical steps $h$. Additionally, the RGB color values $R(h, \phi)$, $G(h, \phi)$ and $B(h, \phi)$ were recorded in the same spatial resolution and were stored in a texture map with 8 bit per channel.

[0024] All faces were without makeup, accessories, and facial hair. The subjects were scanned wearing bathing caps that were removed digitally. Additional automatic pre-processing of the scans, which for most heads required no human interaction, consisted of a vertical cut behind the ears, a horizontal cut to remove the shoulders, and a normalization routine that brought each face to a standard orientation and position in space. The resultant faces were represented by approximately 70,000 vertices and the same number of color values.



II Morphable 3D Face Model 



[0025] The morphable model is based on a data set of 3D faces. Morphing between faces requires full correspondence between all of the faces. In this section, it is assumed that all exemplar faces are in full correspondence. The algorithm for computing correspondence will be described in Section IV.

[0026] We represent the geometry of a face with a shape-vector $S = (X_1, Y_1, Z_1, X_2, \ldots, Y_n, Z_n)^T \in \mathbb{R}^{3n}$ that contains the X, Y, Z coordinates of its n vertices. For simplicity, we assume that the number of valid texture values in the texture map is equal to the number of vertices. We therefore represent the texture of a face by a texture-vector $T = (R_1, G_1, B_1, R_2, \ldots, G_n, B_n)^T \in \mathbb{R}^{3n}$ that contains the R, G, B color values of the n corresponding vertices. A morphable face model was then constructed using a data set of m exemplar faces, each represented by its shape-vector $S_i$ and texture-vector $T_i$. Since we assume all faces in full correspondence (see Section IV), new shapes $S_{mod}$ and new textures $T_{mod}$ can be expressed in barycentric coordinates as a linear combination of the shapes and textures of the m exemplar faces:

$$S_{mod} = \sum_{i=1}^{m} a_i S_i, \qquad T_{mod} = \sum_{i=1}^{m} b_i T_i, \qquad \sum_{i=1}^{m} a_i = \sum_{i=1}^{m} b_i = 1.$$



[0027] We define the morphable model as the set of faces $(S_{mod}(\vec a), T_{mod}(\vec b))$, parameterized by the coefficients $\vec a = (a_1, a_2, \ldots, a_m)^T$ and $\vec b = (b_1, b_2, \ldots, b_m)^T$. (Standard morphing between two faces (m = 2) is obtained if the parameters $a_1$, $b_1$ are varied between 0 and 1, setting $a_2 = 1 - a_1$ and $b_2 = 1 - b_1$.)

Arbitrary new faces can be generated by varying the parameters $\vec a$ and $\vec b$ that control shape and texture.
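The linear combination above can be illustrated with a minimal sketch. The function name `morph` and the toy three-component vectors are assumptions made for illustration only; real shape vectors would hold 3n coordinates of faces that are already in full correspondence.

```python
from typing import List, Sequence

Vector = List[float]


def morph(exemplars: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Return sum_i weights[i] * exemplars[i] for weights that sum to 1."""
    if len(exemplars) != len(weights):
        raise ValueError("one weight per exemplar is required")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("barycentric weights must sum to 1")
    dim = len(exemplars[0])
    result = [0.0] * dim
    for weight, vector in zip(weights, exemplars):
        if len(vector) != dim:
            raise ValueError("all exemplars must have the same length")
        for j, value in enumerate(vector):
            result[j] += weight * value
    return result


# Toy example: two "faces" consisting of a single vertex (X, Y, Z) each.
s1 = [0.0, 0.0, 0.0]
s2 = [2.0, 4.0, 6.0]
halfway = morph([s1, s2], [0.5, 0.5])  # standard morph with a_1 = a_2 = 0.5
```

The same function applies unchanged to texture vectors, since shape and texture are combined with separate coefficient sets.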
[0028] For a useful face synthesis system, it is important to be able to quantify the results in terms of their plausibility of being faces. We therefore estimated the probability distribution for the coefficients $a_i$ and $b_i$ from our example set of faces. This distribution enables us to control the likelihood of the coefficients $a_i$ and $b_i$ and consequently regulates the likelihood of the appearance of the generated faces.

[0029] We fit a multivariate normal distribution to our data set of 200 faces based on the averages of shape $\bar S$ and texture $\bar T$ and the covariance matrices $C_S$ and $C_T$ computed over the shape and texture differences $\Delta S_i = S_i - \bar S$ and $\Delta T_i = T_i - \bar T$.

[0030] A common technique for data compression known as Principal Component Analysis (PCA) [14] performs a basis transformation to an orthogonal coordinate system formed by the eigenvectors $s_i$ and $t_i$ of the covariance matrices (in descending order according to their eigenvalues):

$$S_{model} = \bar S + \sum_{i=1}^{m-1} \alpha_i s_i, \qquad T_{model} = \bar T + \sum_{i=1}^{m-1} \beta_i t_i, \qquad (1)$$

$\vec\alpha, \vec\beta \in \mathbb{R}^{m-1}$. The probability for coefficients $\vec\alpha$ is given by

$$p(\vec\alpha) \sim \exp\Big[-\frac{1}{2} \sum_{i=1}^{m-1} (\alpha_i / \sigma_i)^2\Big], \qquad (2)$$

with $\sigma_i^2$ being the eigenvalues of the shape covariance matrix $C_S$. The probability $p(\vec\beta)$ is computed similarly.
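As a hedged illustration of Equations (1) and (2), the following sketch evaluates a model instance and the unnormalised log-probability of its coefficients. It assumes the eigenvectors and standard deviations have already been computed elsewhere; the names `synthesize` and `log_prior` are hypothetical and not part of the described system.

```python
from typing import List, Sequence


def synthesize(mean: Sequence[float],
               components: Sequence[Sequence[float]],
               coeffs: Sequence[float]) -> List[float]:
    """Equation (1): mean + sum_i coeffs[i] * components[i]."""
    result = list(mean)
    for coeff, component in zip(coeffs, components):
        for j, value in enumerate(component):
            result[j] += coeff * value
    return result


def log_prior(coeffs: Sequence[float], sigmas: Sequence[float]) -> float:
    """Equation (2) up to an additive constant: -1/2 * sum_i (alpha_i / sigma_i)^2."""
    return -0.5 * sum((a / s) ** 2 for a, s in zip(coeffs, sigmas))


# Toy model with two orthogonal "principal components" in a 2D space.
mean_shape = [1.0, 1.0]
eigvecs = [[1.0, 0.0], [0.0, 1.0]]
sigmas = [1.0, 3.0]

shape = synthesize(mean_shape, eigvecs, [2.0, -1.0])
plausibility_far = log_prior([2.0, 0.0], sigmas)
plausibility_mean = log_prior([0.0, 0.0], sigmas)
```

The average face (all coefficients zero) has the highest log-probability; coefficients several standard deviations away from zero describe increasingly unlikely faces.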
Segmented morphable model: The morphable model described in equation (1) has m - 1 degrees of freedom for texture and m - 1 for shape. The expressiveness of the model can be increased by dividing faces into independent subregions that are morphed independently, for example into eyes, nose, mouth and a surrounding region (see Figure 2).

[0031] According to Figure 2, a single prototype adds a large variety of new faces to the morphable model. The deviation of the prototype from the average is added (+) or subtracted (-) from the average. A standard morph (*) is located halfway between the average and the prototype. Subtracting the differences from the average yields an "anti"-face (#). Adding and subtracting deviations independently from shape (S) and texture (T) on each of four segments produces a number of distinct faces.

[0032] Since all faces are assumed to be in correspondence, it is sufficient to define these regions on a reference face. This segmentation is equivalent to subdividing the vector space of faces into independent subspaces. A complete 3D face is generated by computing linear combinations for each segment separately and blending them at the borders according to an algorithm proposed for images by [6].


II.1 Facial attributes

[0033] Shape and texture coefficients $\alpha_i$ and $\beta_i$ in our morphable face model do not correspond to the facial attributes used in human language. While some facial attributes can easily be related to biophysical measurements [12, 9], such as the width of the mouth, others such as facial femininity or being more or less bony can hardly be described by numbers. In this section, a method for mapping facial attributes, defined by a hand-labeled set of example faces, to the parameter space of our morphable model is described. At each position in face space (that is, for any possible face), we define shape and texture vectors that, when added to or subtracted from a face, will manipulate a specific attribute while keeping all other attributes as constant as possible.

[0034] Changes in facial expression generated by performance based techniques [23] can be transferred by adding the differences between two expressions of the same individual, $\Delta S = S_{expression} - S_{neutral}$, $\Delta T = T_{expression} - T_{neutral}$, to a different individual in a neutral expression.

[0035] Unlike facial expressions, attributes that are invariant for each individual are more difficult to isolate. The following method allows us to model facial attributes such as gender, fullness of faces, [...] double chins, and hooked versus concave noses (Figure 3). Figure 3 illustrates the variation of attributes of a single face. The appearance of an original (with frame) can be changed by adding or subtracting shape and texture vectors specific to the attribute.

[0036] Based on a set of faces $(S_i, T_i)$ with manually assigned labels $\mu_i$ describing the markedness of the attribute, we compute weighted sums

$$\Delta S = \sum_{i=1}^{m} \mu_i (S_i - \bar S), \qquad \Delta T = \sum_{i=1}^{m} \mu_i (T_i - \bar T). \qquad (3)$$

[0037] Multiples of $(\Delta S, \Delta T)$ can now be added to or subtracted from any individual face. For binary attributes, such

zzt: 9 ::^vz a s' faces in ctess A and »° - - ,/ma ,or *~ in « Eq < 3 > ^ *° J — 
[0038] To justify this method, let $\mu(S, T)$ be the overall function describing the markedness of the attribute in a face $(S, T)$. Since $\mu(S, T)$ is not available per se for all $(S, T)$, the regression problem of estimating $\mu(S, T)$ from a sample set of labeled faces has to be solved. The present technique assumes that $\mu(S, T)$ is a linear function. Consequently, in order to achieve a change $\Delta\mu$ of the attribute, there is only a single optimal direction $(\Delta S, \Delta T)$ for the whole space of faces. It can be shown that Equation (3) yields the direction with minimal variance-normalized length

$$\|\Delta S\|_M^2 = \langle \Delta S, C_S^{-1} \Delta S \rangle, \qquad \|\Delta T\|_M^2 = \langle \Delta T, C_T^{-1} \Delta T \rangle.$$
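The attribute manipulation can be sketched as follows. The sketch assumes that the weighted sums take the form $\Delta S = \sum_i \mu_i (S_i - \bar S)$, i.e. label-weighted deviations from the average, which is consistent with the linear assumption stated above; the function names and the toy data are hypothetical.

```python
from typing import List, Sequence


def attribute_direction(faces: Sequence[Sequence[float]],
                        labels: Sequence[float]) -> List[float]:
    """Label-weighted sum of deviations from the average face (assumed form)."""
    count = len(faces)
    dim = len(faces[0])
    mean = [sum(face[j] for face in faces) / count for j in range(dim)]
    return [
        sum(mu * (face[j] - mean[j]) for mu, face in zip(labels, faces))
        for j in range(dim)
    ]


def apply_attribute(face: Sequence[float],
                    direction: Sequence[float],
                    amount: float) -> List[float]:
    """Add a multiple of the attribute direction to an individual face."""
    return [value + amount * delta for value, delta in zip(face, direction)]


# Toy binary attribute: class A labeled -1, class B labeled +1.
example_faces = [[0.0, 0.0], [2.0, 0.0]]
example_labels = [-1.0, 1.0]
direction = attribute_direction(example_faces, example_labels)
more_marked = apply_attribute([1.0, 1.0], direction, 0.5)
less_marked = apply_attribute([1.0, 1.0], direction, -0.5)
```

The same routine would be run once on shape vectors and once on texture vectors; a negative `amount` weakens the attribute instead of strengthening it.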

[0039] A different kind of facial attribute is its "distinctiveness", which is commonly manipulated in caricatures. The automated production of caricatures has been possible for many years [5]. This technique can easily be extended from 2D images to the present morphable face model. Individual faces are caricatured by increasing their distance from the average face. In our representation, shape and texture coefficients $\alpha_i$, $\beta_i$ are multiplied by a constant factor or different

III Matching a morphable model to images 

[0040] A crucial element of the invention is an algorithm for automatically matching the morphable face model to one or more images. Providing an estimate of the face's 3D structure (Figure 4), it closes the gap between the specific manipulations described in Section II.1 and the type of data available in typical applications.

[0041] The processing steps for reconstructing 3D shape and texture of a new face from a single image are illustrated in the flow chart of Figure 4. After a rough manual alignment of the averaged 3D head (top row), the automated matching procedure fits the 3D morphable model to the image (center row). In the right column, the model is rendered on top of the input image. Details in texture can be improved by illumination-corrected texture extraction from the input (bottom row). This correction comprises a back-projection of the generated image to the input image with an illumination correction. The color information from the original image is used for correcting the generated image. This illumination correction by back-projection represents an important and advantageous feature of the invention.
[0042] Coefficients of the 3D model are optimized along with a set of rendering parameters such that they produce an image as close as possible to the input image. In an analysis-by-synthesis loop, the algorithm creates a texture mapped 3D face from the current model parameters, renders an image, and updates the parameters according to the residual difference. It starts with the average head and with rendering parameters roughly estimated by the user.
[0043] Model Parameters: Facial shape and texture are defined by coefficients $\alpha_j$ and $\beta_j$, $j = 1, \ldots, m - 1$ (Equation 1). Rendering parameters $\vec\rho$ depend on the application and contain camera position (azimuth and elevation), object scale, image plane rotation and translation, intensity $i_{r,amb}, i_{g,amb}, i_{b,amb}$ of ambient light, and/or intensity $i_{r,dir}, i_{g,dir}, i_{b,dir}$ of directed light. In order to handle photographs taken under a wide variety of conditions, $\vec\rho$ also includes color contrast as well as offset and gain in the red, green, and blue channel. Other parameters, such as camera distance, light direction, and surface shininess, remain fixed to the values estimated by the user or with an appropriate algorithm.
[0044] From parameters $(\vec\alpha, \vec\beta, \vec\rho)$, colored images

$$I_{model}(x, y) = (I_{r,mod}(x, y), I_{g,mod}(x, y), I_{b,mod}(x, y))^T \qquad (4)$$

are rendered using perspective projection and the Phong illumination model. The reconstructed image is supposed to be closest to the input image in terms of Euclidean distance

$$E_I = \sum_{x,y} \| I_{input}(x, y) - I_{model}(x, y) \|^2.$$

[0045] Matching a 3D surface to a given image is an ill-posed problem. Along with the desired solution, many non-face-like surfaces lead to the same image. It is therefore essential to impose constraints on the set of solutions. It is an essential advantage of the invention that in the present morphable model, shape and texture vectors are restricted to the vector space spanned by the database. Accordingly, non-face-like surfaces can be completely avoided.
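[0046] Within the vector space of faces, solutions can be further restricted by a tradeoff between matching quality and prior probabilities, using $p(\vec\alpha)$, $p(\vec\beta)$ from Section II and an ad-hoc estimate of $p(\vec\rho)$. In terms of Bayes decision theory, the problem is to find the set of parameters $(\vec\alpha, \vec\beta, \vec\rho)$ with maximum posterior probability, given an image $I_{input}$. While $\vec\alpha$, $\vec\beta$, and rendering parameters $\vec\rho$ completely determine the predicted image $I_{model}$, the observed image $I_{input}$ may vary due to noise. For Gaussian noise with a standard deviation $\sigma_N$, the likelihood to observe $I_{input}$ is $p(I_{input} \mid \vec\alpha, \vec\beta, \vec\rho) \sim \exp[-\frac{1}{2\sigma_N^2} \cdot E_I]$. Maximum posterior probability is then achieved by minimizing the cost function

$$E = \frac{1}{\sigma_N^2} E_I + \sum_{j=1}^{m-1} \frac{\alpha_j^2}{\sigma_{S,j}^2} + \sum_{j=1}^{m-1} \frac{\beta_j^2}{\sigma_{T,j}^2} + \sum_{j} \frac{(\rho_j - \bar\rho_j)^2}{\sigma_{\rho,j}^2}, \qquad (5)$$

where $\sigma_{S,j}^2$ and $\sigma_{T,j}^2$ are the eigenvalues of $C_S$ and $C_T$, and $\bar\rho_j$, $\sigma_{\rho,j}$ describe the ad-hoc prior on the rendering parameters.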
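The tradeoff between matching quality and prior probability can be made concrete with a small sketch. It assumes independent Gaussian priors on all parameters, so that the negative log posterior (times two, up to a constant) is a sum of squared, variance-normalized terms; the function name `map_cost` and all argument names are hypothetical.

```python
from typing import Sequence


def map_cost(image_error: float, sigma_n: float,
             alpha: Sequence[float], sigma_s: Sequence[float],
             beta: Sequence[float], sigma_t: Sequence[float],
             rho: Sequence[float], rho_mean: Sequence[float],
             sigma_rho: Sequence[float]) -> float:
    """Image error weighted by 1/sigma_n^2 plus Gaussian prior penalties."""
    data_term = image_error / sigma_n ** 2
    shape_term = sum((a / s) ** 2 for a, s in zip(alpha, sigma_s))
    texture_term = sum((b / s) ** 2 for b, s in zip(beta, sigma_t))
    render_term = sum(((r - r0) / s) ** 2
                      for r, r0, s in zip(rho, rho_mean, sigma_rho))
    return data_term + shape_term + texture_term + render_term


# Toy values: each of the four terms evaluates to 1.
cost = map_cost(4.0, 2.0, [1.0], [1.0], [2.0], [2.0], [3.0], [1.0], [2.0])

# A larger sigma_n down-weights the image term, so the prior dominates.
cost_prior_heavy = map_cost(4.0, 4.0, [1.0], [1.0], [2.0], [2.0],
                            [3.0], [1.0], [2.0])
```

Decreasing `sigma_n` shifts the optimum away from the prior expectation and towards a closer match of the input image.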




[0047] The optimization algorithm described below uses an estimate of $E$ based on a random selection of surface points. Predicted color values $I_{model}$ are easiest to evaluate in the centers of triangles. In the center of triangle $k$, texture $(\bar R_k, \bar G_k, \bar B_k)^T$ and 3D location $(\bar X_k, \bar Y_k, \bar Z_k)^T$ are averages of the values at the corners. Perspective projection maps these points to image locations $(\bar p_{x,k}, \bar p_{y,k})^T$. Surface normals $n_k$ of each triangle $k$ are determined by the 3D locations of the corners. According to Phong illumination, the color components $I_{r,model}$, $I_{g,model}$ and $I_{b,model}$ take the form

$$I_{r,model,k} = (i_{r,amb} + i_{r,dir} \cdot (n_k l)) \bar R_k + i_{r,dir}\, s \cdot (r_k v_k)^{\nu} \qquad (6)$$

where $l$ is the direction of illumination, $v_k$ the normalized difference of camera position and the position of the triangle's center, and $r_k = 2(n l)n - l$ the direction of the reflected ray. $s$ denotes surface shininess, and $\nu$ controls the angular distribution of the specular reflection. Equation (6) reduces to $I_{r,model,k} = i_{r,amb} \bar R_k$ if a shadow is cast on the center of the triangle, which is tested in a method described below.
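A minimal sketch of the per-triangle Phong evaluation for one color channel is given below. Clamping negative dot products to zero is an assumption added here (a common convention that the text does not state), and the function names are hypothetical. All direction vectors are expected to be normalized.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def reflect(normal: Sequence[float], light: Sequence[float]) -> List[float]:
    """Direction of the reflected ray, r = 2 (n . l) n - l."""
    d = dot(normal, light)
    return [2.0 * d * n_i - l_i for n_i, l_i in zip(normal, light)]


def phong_channel(albedo: float, i_amb: float, i_dir: float,
                  normal: Sequence[float], light: Sequence[float],
                  view: Sequence[float], shininess: float, nu: float,
                  in_shadow: bool = False) -> float:
    """Color of one channel at a triangle center, following Equation (6)."""
    if in_shadow:
        # Only ambient light reaches a shadowed triangle.
        return i_amb * albedo
    diffuse = max(0.0, dot(normal, light))                  # assumed clamp
    specular = max(0.0, dot(reflect(normal, light), view))  # assumed clamp
    return (i_amb + i_dir * diffuse) * albedo + i_dir * shininess * specular ** nu


# Light and camera both straight above a triangle facing upwards.
up = [0.0, 0.0, 1.0]
lit = phong_channel(0.5, 0.2, 0.8, up, up, up, shininess=0.1, nu=8.0)
shadowed = phong_channel(0.5, 0.2, 0.8, up, up, up, shininess=0.1, nu=8.0,
                         in_shadow=True)
```

In this configuration the diffuse and specular factors both equal 1, so the lit value is the ambient-plus-directed albedo term plus the full specular highlight.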

[0048] For high resolution 3D meshes, variations in $I_{model}$ across each triangle $k \in \{1, \ldots, n_t\}$ are small, so $E_I$ may be approximated by

$$E_I \approx \sum_{k=1}^{n_t} a_k \cdot \| I_{input}(\bar p_{x,k}, \bar p_{y,k}) - I_{model,k} \|^2,$$

where $a_k$ is the image area covered by triangle $k$. If the triangle is occluded, $a_k = 0$.

[0049] In gradient descent, contributions from different triangles of the mesh would be redundant. In each iteration, we therefore select a random subset $K \subset \{1, \ldots, n_t\}$ of 40 triangles $k$ and replace $E_I$ by

$$E_K = \sum_{k \in K} \| I_{input}(\bar p_{x,k}, \bar p_{y,k}) - I_{model,k} \|^2. \qquad (7)$$

[0050] The probability of selecting $k$ is $p(k \in K) \sim a_k$. This method of stochastic gradient descent [15] is not only more efficient computationally, but also helps to avoid local minima by adding noise to the gradient estimate.
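The area-weighted random selection of triangles can be sketched with the standard library. Note that `random.choices` samples with replacement, which is an assumption of this sketch; the text does not say whether a triangle may be drawn twice. The function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def sample_triangles(areas: Sequence[float], subset_size: int = 40,
                     rng: Optional[random.Random] = None) -> List[int]:
    """Draw triangle indices with probability proportional to image area a_k."""
    rng = rng or random.Random()
    visible = [k for k, area in enumerate(areas) if area > 0.0]
    if not visible:
        return []
    weights = [areas[k] for k in visible]
    # Sampling with replacement (assumption); occluded triangles never appear.
    return rng.choices(visible, weights=weights, k=subset_size)


def subset_error(input_colors: Sequence[Sequence[float]],
                 model_colors: Sequence[Sequence[float]],
                 subset: Sequence[int]) -> float:
    """Equation (7): sum of squared color differences over the selected triangles."""
    return sum(
        sum((a - b) ** 2 for a, b in zip(input_colors[k], model_colors[k]))
        for k in subset
    )


areas = [0.0, 1.0, 3.0]  # triangle 0 is occluded
subset = sample_triangles(areas, 40, random.Random(0))
error = subset_error([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                     [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]],
                     [0, 0, 1])
```

Because larger triangles are drawn more often, the unweighted sum in Equation (7) approximates the area-weighted error on average.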

[0051] Before the first iteration, and once every 1000 steps, the method computes the full 3D shape of the current model and the 2D positions $(p_x, p_y)^T$ of all vertices. It then determines $a_k$, and detects hidden surfaces and cast shadows [...]. We assume that occlusions and cast shadows are constant during each subset of iterations.

[0052] Parameters are updated depending on analytical derivatives of the cost function $E$, using $\alpha_j \to \alpha_j - \lambda_j \cdot \partial E / \partial \alpha_j$, and similarly for $\beta_j$ and $\rho_j$, with suitable factors $\lambda_j$.
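The update rule can be illustrated on a toy quadratic cost whose derivatives are known analytically. The cost function, the step sizes and the function names below are illustrative assumptions and are unrelated to the actual image-based cost.

```python
from typing import List, Sequence


def gradient_step(params: Sequence[float], grads: Sequence[float],
                  rates: Sequence[float]) -> List[float]:
    """One update p_j -> p_j - lambda_j * dE/dp_j for every parameter."""
    return [p - lam * g for p, g, lam in zip(params, grads, rates)]


def toy_cost(p: Sequence[float]) -> float:
    """Stand-in cost with its minimum at (3, -1)."""
    return (p[0] - 3.0) ** 2 + 2.0 * (p[1] + 1.0) ** 2


def toy_grad(p: Sequence[float]) -> List[float]:
    """Analytical derivatives of toy_cost."""
    return [2.0 * (p[0] - 3.0), 4.0 * (p[1] + 1.0)]


params = [0.0, 0.0]
for _ in range(200):
    params = gradient_step(params, toy_grad(params), [0.1, 0.1])
```

Using a separate factor per parameter mirrors the text's "suitable factors", since shape, texture and rendering parameters live on very different scales.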

[0053] Derivatives of texture and shape (Equation 1) yield derivatives of 2D locations $(\bar p_{x,k}, \bar p_{y,k})^T$, surface normals [...]

Coarse-to-Fine: In order to avoid local minima, the algorithm follows a coarse-to-fine strategy in several respects:

a) The first set of iterations is performed on a down-sampled version of the input image with a low resolution morphable model.

b) We start by optimizing only the first coefficients $\alpha_j$ and $\beta_j$ controlling the first principal components, along with all parameters $\rho_j$. In subsequent iterations, more and more principal components are added.

c) Starting with a relatively strong weight on prior probability in equation (5), which ties the optimum towards the prior expectation value, we reduce this weight (or equivalently $\sigma_N$) to obtain maximum matching quality.

d) In the last iterations, the face model is broken down into segments (Section II). With parameters $\rho_j$ fixed, coefficients $\alpha_j$ and $\beta_j$ [...]

[0054] Multiple Images: It is straightforward to extend this technique to the case where several images of a person are available (Figure 5). Figure 5 illustrates a simultaneous reconstruction of 3D shape and texture of a new face from two images taken under different conditions. In the center row, the 3D face is rendered on top of the input images. This demonstrates an essential advantage of the invention. The image processing method can be implemented with one or more input images. There are no restrictions with regard to the imaging conditions of the input images.

so z^r*^:;™ a9ainst ,he 30 — « - — - s- ««■ s 

[0055] While shape and texture are still described by a common set of $\alpha_j$ and $\beta_j$, there is now a separate set of $\rho_j$ for each input image.

[0056] Illumination-Corrected Texture Extraction: Specific features of individual faces that are not captured by the morphable model [...] are extracted from the image in a subsequent texture adaptation process. Extracting texture from images is a technique widely used in constructing 3D models from images (e.g. [26]). However, in order to be able to change pose and illumination, it is important to separate pure albedo at any given point from the influence of shading and cast shadows in the image. In the inventive approach, this can be achieved because the matching procedure provides an estimate of 3D shape, pose, and illumination conditions. Subsequent to matching, we compare the prediction $I_{mod,i}$ for each vertex $i$ with $I_{input}(p_{x,i}, p_{y,i})$, and compute the minimum change in texture $(R_i, G_i, B_i)$ that accounts for the difference. In areas occluded in the image, we rely on the prediction made by the model. Data from multiple images can be blended using methods similar to [26].


III.1 Matching a morphable model to 3D scans

[0058] The method described above can also be applied to register new 3D faces. Analogous to images, where perspective projection $P: \mathbb{R}^3 \to \mathbb{R}^2$ and an illumination model define a colored image $I(x, y) = (R(x, y), G(x, y), B(x, y))^T$, laser scans provide a two-dimensional cylindrical parameterization of the surface by means of a mapping $C: \mathbb{R}^3 \to \mathbb{R}^2$, $(x, y, z) \to (h, \phi)$.
Hence, a scan can be represented as

$$I(h, \phi) = (R(h, \phi), G(h, \phi), B(h, \phi), r(h, \phi))^T. \qquad (8)$$
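The cylindrical parameterization can be sketched as a pair of coordinate conversions. The axis convention is an assumption made for this example only: the vertical axis is taken to be y, and the angle is measured from the z axis; the scanner's actual convention is not stated in the text.

```python
import math
from typing import Tuple


def to_cylindrical(x: float, y: float, z: float) -> Tuple[float, float, float]:
    """Map a surface point to (h, phi, r), assuming y is the vertical axis."""
    h = y
    phi = math.atan2(x, z)  # angle around the vertical axis, 0 along +z
    r = math.hypot(x, z)    # radius, stored as the fourth scan channel
    return h, phi, r


def from_cylindrical(h: float, phi: float, r: float) -> Tuple[float, float, float]:
    """Inverse mapping back to Cartesian coordinates."""
    return r * math.sin(phi), h, r * math.cos(phi)


h, phi, r = to_cylindrical(0.0, 2.0, 1.0)
x, y, z = from_cylindrical(*to_cylindrical(1.0, 5.0, 1.0))
```

Each vertex of the model can be placed in the (h, phi) grid this way, so that its color and radius are compared with the scan values at the same grid position.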

[0059] In a face $(S, T)$, defined by shape and texture coefficients $\alpha_j$ and $\beta_j$ (Equation 1), vertex $i$ with texture values $(R_i, G_i, B_i)$ and cylindrical coordinates $(r_i, h_i, \phi_i)$ is mapped to $I_{model}(h_i, \phi_i) = (R_i, G_i, B_i, r_i)^T$.
[0060] The matching algorithm from the previous section now determines $\alpha_j$ and $\beta_j$ minimizing

$$E = \sum_{h, \phi} \| I_{input}(h, \phi) - I_{model}(h, \phi) \|^2.$$
IV Building a morphable model

[0061] In this section, it is described how to build the morphable model from a set of unregistered 3D prototypes, and to add a new face to the existing morphable model, increasing its dimensionality.

[0062] The key problem is to compute a dense point-to-point correspondence between the vertices of the faces. Since the method described in Section III.1 finds the best match of a given face only within the range of the morphable model, it cannot add new dimensions to the vector space of faces. To determine residual deviations between a novel face and the best match within the model, as well as to set unregistered prototypes in correspondence, we use an optic flow algorithm that computes correspondence between two faces without the need of a morphable model [32]. The following section summarizes the technique as adapted to the invention.

IV.1 3D Correspondence using Optical Flow 

[0063] Initially designed to find corresponding points in grey-level images $I(x, y)$, a gradient-based optic flow algorithm is modified to establish correspondence between a pair of 3D scans $I(h, \phi)$ (Equation 8), taking into account color and radius values simultaneously. The algorithm computes a flow field $(\delta h(h, \phi), \delta\phi(h, \phi))$ that minimizes differences of $\| I_1(h, \phi) - I_2(h + \delta h, \phi + \delta\phi) \|$ in a norm that weights variations in texture and shape. Surface properties from differential geometry, such as mean curvature, may be used as additional components in $I(h, \phi)$.

[0064] On facial regions with little structure in texture and shape, such as forehead and cheeks, the results of the optical flow algorithm are sometimes spurious. We therefore perform a smooth interpolation based on simulated relaxation of a system of flow vectors that are coupled with their neighbors. The quadratic coupling potential is equal for all flow vectors. On high-contrast areas, components of flow vectors orthogonal to edges are bound to the result of the previous optic flow computation. The system is otherwise free to take on a smooth minimum-energy arrangement. Unlike simple filtering routines, our technique fully retains matching quality wherever the flow field is reliable. Optical flow and smooth interpolation are computed on several consecutive levels of resolution.

[0065] Constructing a morphable face model from a set of unregistered 3D scans requires the computation of the
flow fields between each face and an arbitrary reference face. Given a definition of shape and texture vectors $S_{ref}$ and
$T_{ref}$ for the reference face, $S$ and $T$ for each face in the database can be obtained by means of the point-to-point
correspondence provided by $(\delta h(h, \phi), \delta\phi(h, \phi))$.
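
A possible way to read off shape and texture vectors from a scan once the flow field is known is sketched here. It assumes a cylindrical scan of shape (H, P, 4) with channels (r, R, G, B), an angular axis sampled uniformly over a full turn, height measured in grid units, and an integer flow defined on the reference grid. The function name `vectors_in_correspondence` and these conventions are assumptions made for illustration.

```python
import numpy as np

def vectors_in_correspondence(scan, dh, dphi):
    """Sample a scan at the positions corresponding to the reference grid.

    scan: array (H, P, 4) with channels (r, R, G, B) (assumed layout).
    dh, dphi: integer flow from the reference face to this face, shape (H, P).
    Returns (S, T): flattened (x, y, z) coordinates and (R, G, B) values,
    ordered like the vertices of the reference face.
    """
    H, P, _ = scan.shape
    h, phi = np.meshgrid(np.arange(H), np.arange(P), indexing="ij")
    h2 = np.clip(h + dh, 0, H - 1)
    phi2 = (phi + dphi) % P
    sampled = scan[h2, phi2]
    angle = 2.0 * np.pi * phi2 / P
    r = sampled[..., 0]
    xyz = np.stack([r * np.cos(angle), h2.astype(float), r * np.sin(angle)], axis=-1)
    return xyz.reshape(-1), sampled[..., 1:].reshape(-1)
```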
IV.2 Further improving the model

[0066] Because the optic flow algorithm does not incorporate any constraints on the set of solutions, it could fail on
some of the more unusual faces in the database. Therefore, we modified a bootstrapping algorithm to iteratively improve
correspondence, on the basis of a method that has been used previously to build linear image models [33].

[0067] The basic recursive step: Suppose that an existing morphable model is not powerful enough to match a new
face and thereby find correspondence with it. The idea is first to find rough correspondences to the novel face using
the (inadequate) morphable model and then to improve these correspondences by using an optical flow algorithm.

[0068] Starting from an arbitrary face as the temporary reference, preliminary correspondence between all other
faces and this reference is computed using the optic flow algorithm. On the basis of these correspondences, shape
and texture vectors S and T can be computed. Their average serves as a new reference face. The first morphable
model is then formed by the most significant components as provided by a standard PCA decomposition. The current
morphable model is now matched to each of the 3D faces according to the method described in Section III.1. Then,
the optic flow algorithm computes correspondence between the 3D face and the approximation provided by the
morphable model. Combined with the correspondence implied by the matched model, this defines a new correspondence
between the reference face and the example.

[0069] Iterating this procedure with increasing expressive power of the model (by increasing the number of principal
components) leads to reliable correspondences between the reference face and the examples, and finally to a complete
morphable face model.
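
The iteration can be outlined in code as a rough sketch. It assumes that faces are given as rows of a NumPy array, that a plain least-squares projection onto the principal components stands in for the matching procedure of Section III.1, and that the optic flow refinement is supplied by the caller as a hypothetical callable `refine(raw_faces, approximations)`. None of these names come from the source.

```python
import numpy as np

def build_pca_model(vectors, n_components):
    """Return the average vector and the most significant principal components."""
    mean = vectors.mean(axis=0)
    _, _, components = np.linalg.svd(vectors - mean, full_matrices=False)
    return mean, components[:n_components]

def match_to_model(vector, mean, components):
    """Best approximation of a vector within the span of the model."""
    coeffs = components @ (vector - mean)
    return mean + components.T @ coeffs, coeffs

def bootstrap(raw_faces, schedule, refine):
    """Alternate model building and correspondence refinement.

    schedule: increasing numbers of principal components, e.g. [10, 30, 100].
    refine: hypothetical stand-in for the optic flow step; it receives the raw
            faces and their model approximations and returns re-registered
            face vectors of the same shape.
    """
    current = raw_faces
    for k in schedule:
        mean, comps = build_pca_model(current, k)
        approx = np.array([match_to_model(v, mean, comps)[0] for v in current])
        current = refine(raw_faces, approx)
    return build_pca_model(current, schedule[-1])
```

Increasing the number of components from pass to pass mirrors the growing expressive power of the model described in the text.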

V Image Processing System 

[0070] A basic configuration of an image processing system according to the invention is schematically
illustrated in Figure 8. The image processing system 10 contains a 3D database 20, a model processor 30,
a 2D input circuit 40, an object analyzer 50, a back-projection circuit 60, a modeler circuit 70 and a 3D output circuit
80 being connected with a computer graphic rendering engine 80a and/or a CAD system 80b. Further details of an
image processing system that are known as such (e.g. controlling [...]) are not shown.

[0071] The 3D database 20 contains the structure data of a plurality of objects (e.g. human faces) being obtained
from a suitable optical object detection, e.g. on the basis of laser scans. The 3D database 20 is connected with the
model processor 30. The model processor 30 delivers in particular an average face (e.g. like in Figure 4, top row right) to the
object analyzer 50 as well as reference data to the modeler circuit 70. The 2D input circuit 40 is adapted to receive
one or more input images in an appropriate format, e.g. photographs, synthesized images or the like. The 2D input
circuit 40 is connected with the object analyzer 50 matching the morphable model received from the model processor
30 to the input image(s). As a result, the object analyzer 50 generates a 3D model of the input image which is delivered
to the back-projection circuit 60 or directly to the modeler circuit 70 or to the 3D output circuit 80. On the basis of the
3D model received from the object analyzer 50 and the original color data received from the 2D input circuit 40, the
back-projection circuit 60 performs a model correction as outlined above. The corrected model is delivered to the
modeler circuit 70 or directly to the 3D output circuit 80. Finally, the modeler circuit 70 is adapted to introduce amended
facial features to the (corrected) 3D model using the input of the model processor 30 as outlined above.

VI Results and modifications 

[0072] According to the invention a morphable face model has been built by automatically establishing correspondence
between all of e.g. 200 exemplar faces. The interactive face modeling system enables human users to create
new characters and to modify facial attributes by varying the model coefficients. The modifying facial features comprise
e.g. gaining or losing weight, frowning or smiling or even "being forced to smile". Within the constraints imposed by
prior probability, there is a large variability of possible faces, and all linear combinations of the exemplar faces look
natural.

[0073] The expressive power of the morphable model has been tested by automatically reconstructing 3D faces from
photographs of arbitrary Caucasian faces of middle age that were not in the database. The images were either taken
by us using a digital camera (Figures 4, 5), or taken under arbitrary unknown conditions (Figure 6).

[0074] In all examples, we matched a morphable model built from the first 100 shape and the first 100 texture principal
components that were derived from the whole dataset of 200 faces. Each component was additionally segmented in
4 parts (see Figure 3). The whole matching procedure was performed in 10^5 iterations. On an SGI R10000 processor,
computation time was 50 minutes.

[0075] Reconstructing the true 3D shape and texture of a face from a single image is an ill-posed problem. However,
to human observers who also know only the input image, the results obtained with our method look correct. When
compared with a real image of the rotated face, differences usually become visible only for large rotations of more than
60°.

[0076] There is a wide variety of applications for 3D face reconstruction from 2D images. As demonstrated in Figure 6,
the results can be used for automatic post-processing of a face within the original picture or movie sequence.

[0077] Knowing the 3D shape of a face in an image provides a segmentation of the image into face area and back- 
ground. The face can be combined with other 3D graphic objects, such as glasses or hats, and then be rendered in 
front of the background, computing cast shadows or new illumination conditions (Fig. 6). Furthermore, we can change 
the appearance of the face by adding or subtracting specific attributes. If previously unseen backgrounds become
visible, the holes can be filled with neighboring background pixels.

[0078] We also applied the method to paintings such as Leonardo's Mona Lisa (Figure 7). Figure 7 illustrates a
reconstructed 3D face of Mona Lisa (top center and right). For modifying the illumination, color differences (bottom left)
are computed on the 3D face, and then added to the painting (bottom center). Additional warping generated new
orientations (bottom right). Illumination-corrected texture extraction, however, is difficult here, due to unusual (maybe
unrealistic) lighting. We therefore apply a different method for transferring all details of the painting to novel views. For
new illumination (Figure 7, bottom center), we render two images of the reconstructed 3D face with different illumination, 
and add differences in pixel values (Figure 7, bottom left) to the painting. For a new pose (bottom right), differences in 
shading are transferred in a similar way, and the painting is then warped according to the 2D projections of 3D vertex 
displacements of the reconstructed shape. 
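
The difference-based relighting of the painting can be sketched as follows, assuming 8-bit images of identical size that are already pixel-aligned. The function name `transfer_illumination` and the clipping to the 0..255 range are assumptions made for illustration.

```python
import numpy as np

def transfer_illumination(painting, render_old, render_new):
    """Add the pixel-wise difference of two renderings to the original image.

    painting: uint8 image of the original painting.
    render_old: rendering of the reconstructed 3D face under the estimated
                original illumination.
    render_new: rendering of the same face under the new illumination.
    """
    diff = render_new.astype(float) - render_old.astype(float)
    return np.clip(painting.astype(float) + diff, 0, 255).astype(np.uint8)
```

Because only differences are added, all details of the painting that the model cannot reproduce are preserved in the result.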

[0079] According to the invention the basic components for a fully automated face modeling system based on prior
knowledge about the possible appearances of faces are presented. Further extensions are contemplated under the 
following aspects: 

[0080] Issues of implementation: We plan to speed up our matching method by implementing a simplified Newton
method for minimizing the cost function (Equation 5). Instead of the time-consuming computation of derivatives for
each iteration step, a global mapping of the matching error into the parameter space can be used [8].

[0081] Data reduction applied to shape and texture data will reduce redundancy of our representation, saving addi- 
tional computation time. 

[0082] Extending the database: While the current database is sufficient to model Caucasian faces of middle age, 
we would like to extend it to children, to elderly people as well as to other races. 
[0083] We also plan to incorporate additional 3D face examples representing the time course of facial expressions
and visemes, the face variations during speech.

[0084] The laser scanning technology is to be extended to the collection of dynamical 3D face data. The development 
of fast optical 3D digitizers [25] will allow us to apply the method to streams of 3D data during speech and facial 
expressions. 

[0085] Extending the face model: The current morphable model for human faces is restricted to the face area,
because a sufficient 3D model of hair cannot be obtained with our laser scanner. For animation, the missing part of
the head can be automatically replaced by a standard hair style or a hat, or by hair that is modeled using interactive
manual segmentation and adaptation to a 3D model [28, 26]. Automated reconstruction of hair styles from images is
one of the future challenges. 

[0086] The invention can be used with advantage in the field of image recognition. From a matched face model, the
coefficients are used as a coding of the respective face. An image to be investigated is identified as this face if the 
coefficients corresponding to the image are identical or similar to the coding coefficients of the model. 
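
A minimal sketch of such a coefficient-based identification is given here. It assumes that the coding coefficients of known faces are stored in a dictionary and that similarity is measured by Euclidean distance with a user-chosen threshold; the name `identify`, the distance measure and the threshold are assumptions, since the text only requires the coefficients to be identical or similar.

```python
import numpy as np

def identify(query_coeffs, gallery, max_distance):
    """Return the name of the stored face whose coefficients are closest.

    query_coeffs: model coefficients obtained by matching the image.
    gallery: dict mapping a name to the coding coefficients of that face.
    max_distance: largest distance still regarded as similar (assumption).
    Returns None if no stored face is close enough.
    """
    best_name, best_dist = None, np.inf
    for name, coeffs in gallery.items():
        dist = float(np.linalg.norm(np.asarray(query_coeffs) - np.asarray(coeffs)))
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None
```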
[0087] Further applications of the invention are given in the field of modelling images of three-dimensional objects 
other than human faces. These objects comprise e.g. complete human bodies, bodies or faces of animals, technical
objects (such as cars, furniture) and the like.
Claims

1. Method of processing an image of a three-dimensional object, comprising the steps of

- providing a morphable object model being derived from a plurality of 3D images,
- matching the morphable object model to at least one 2D object image, and
- providing the matched morphable object model as a 3D representation of the object.

2. Method according to claim 1, wherein the matched object model is subjected to a back-projection to color data of
the 2D input image of the object.

3. Method according to claim 2, wherein the back-projection yields an illumination correction for obtaining color data
of the surface of the object. 

4. Method according to claim 3, wherein the color-corrected data are subjected to an adaptation to changed illumination
conditions. 

5. Method according to one of the foregoing claims, wherein the matched morphable object model is subjected to a 
modelling step for modifying at least one object feature. 






6. Method according to one of the foregoing claims, wherein the objects comprise human faces, animal faces, human 
bodies, animal bodies or technical objects. 

7. Method of generating a morphable object model, comprising the steps of

- generating a 3D database comprising a plurality of 3D images of prototype objects,
- subjecting the data of the 3D database to a data processing providing correspondences between the prototype
objects and at least one reference object, and
- providing the morphable object model as a set of objects comprising linear combinations of the shapes and
textures of the prototype objects.

8. Method according to claim 7, wherein the reference object is represented by average object data. 

9. Method according to claim 7 or 8, wherein the set of objects is parameterized with the coefficients of the linear
combinations and a probability distribution of the coefficients is estimated.

10. Method according to one of the claims 7 to 9, wherein the morphable object model is generated for a segment of 
the object. 

11. Method according to one of the claims 7 to 10, wherein the objects are human faces, animal faces, human bodies,
animal bodies or technical objects. 

12. Method of recognizing an object, wherein a 3D model of the object to be recognized is processed with a method 
according to one of the foregoing claims. 


13. Method of synthesizing a 3D model of a face with certain facial attributes with the method according to one of the
claims 1 to 11.

14. Image processing system (10) for processing 3D images, comprising a 3D database (20), a model processor (30),
a 2D input circuit (40), an object analyzer (50), and a 3D output circuit (80).

15. System according to claim 14, which further comprises a back-projection circuit (60) and/or a modeler circuit (70).

16. System according to claim 14 or 15, wherein the 3D output circuit (80) comprises a computer graphic rendering
engine and/or a CAD processor.